[libc] Enhance debugging output for amalloc and dmalloc #2144
btw, sysctl malloc.debug=2 is nice! [EDIT]: I forgot to recompile after this commit - doing it now
Ok. Does setting sysctl still crash the system? If so, does this happen after any particular toolchain program, or just one of them?
I'm not sure what I did, but after system compilation, no system crash. [EDIT: managed to reproduce. Just start compilation, press Ctrl+C and umount /mnt]
We're probably experiencing memory corruption. "START" is the first thing the kernel displays; it was put in there to show that the kernel is restarting, which should never happen and is usually the result of a CALL 0 or something like that. To debug this, we can either 1) increase the memory arena to see if that changes anything, 2) not do ^C, to see if the problem has to do only with signal handling (this is a definite possibility, perhaps the signal handling in make86 is causing corruption), or 3) see whether this happens using a shell script like ecc to run the compilations rather than make. We also need to try to find which tool is causing the corruption. MAKE is not running OWC, so it's not running the new arena code. I will put together a wrapper for it that will cause it to run the debug v7 allocator. It is nice that you have a repeatable test case. Umount doesn't do anything special; that's probably just the area of code that got corrupted and ends up causing the kernel crash.
A potential issue with the large model is that all pointers are far and can point to anything, so a bad pointer can easily scribble onto kernel code or data. For that reason I have been working on even tighter restrictions and better checking in our memory management routines, but any internal bug in the program itself can also show up as a kernel crash.
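To illustrate the kind of check meant by "tighter restrictions", here is a minimal sketch of a range test that a free routine could run before touching a pointer. It is written for a flat-address host, not taken from the ELKS libc: `demo_arena`, `DEMO_ARENA_SIZE` and `arena_contains` are hypothetical names, and a real 8086 large-model version would also have to normalize segment:offset far pointers before comparing them.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size arena, for illustration only. */
#define DEMO_ARENA_SIZE 1024u

unsigned char demo_arena[DEMO_ARENA_SIZE];

/*
 * Return 1 if the range [p, p + len) lies entirely inside demo_arena,
 * 0 otherwise. The comparison is arranged so it cannot overflow.
 */
int arena_contains(const void *p, size_t len)
{
    uintptr_t start = (uintptr_t)demo_arena;
    uintptr_t addr = (uintptr_t)p;

    if (len > DEMO_ARENA_SIZE)
        return 0;               /* block larger than the whole arena */
    if (addr < start)
        return 0;               /* pointer below the arena */
    return (addr - start) <= (DEMO_ARENA_SIZE - len);
}
```

A free routine could refuse (and report) any pointer for which `arena_contains(p, size)` returns 0, instead of writing allocator metadata through it. This only catches bad pointers handed to the allocator; it does not stop the program from writing through a stray pointer on its own.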
I can confirm it has nothing to do with the printfs. After I Ctrl+C nasm and try to umount the floppy, I see the same crash. [EDIT: Ctrl+C was not needed]
So, it was nasm destroying the memory. Fixed:
That's very good to know, I'll cross off worrying about possible undefined behavior occurring during a signal callback.
Well, I am glad you have this working again at 32K heap rather than 24K. But IMO, this isn't likely to fix the problem, as it seems the problem is a stray pointer, most probably a bug in NASM. Changing the max heap just shifts pointer values around, so it could very easily be that NASM is now just writing somewhere else in memory that we can't see. Of course, it is possible that arena malloc is the culprit, but CPP and C86 are running fine, and arena malloc was meticulously crafted from the already working V7 malloc. Until we can get the crash to repeat in NASM on the host, it will likely be hard to debug. However, after @toncho11's testing on real 8088 hardware, it's becoming pretty obvious that NASM isn't ever going to work as our core assembler: it's way too big, way too slow, and possibly buggy. I think we should try to find another assembler that accepts NASM-style (or close, possibly MASM-style) assembler input and switch completely to that. I plan on making C86 work with dual ASM output, in both NASM and AS86 format, so that in the interim we can switch to AS86 for the devkit. I hope to get that working while you're on holiday. After seeing the results of NASM on both your system and @toncho11's, I think we should get rid of NASM completely, the sooner the better! The memory overwrite problem could easily come back at us at any time, and our entire toolchain would be in trouble. Of course, it just isn't fast enough on any real hardware anyways.
This PR allows for better understanding of debug output from the arena and debug malloc allocators, and new debug output was added to fmemalloc.
Now, when running
sysctl malloc.debug=2
before the toolchain, a one-line summary is displayed whenever malloc, free or fmemalloc is called. This information has been used to tune the 8086 toolchain programs. For instance, the following is output when CPP86 is run:
This shows the single fmemalloc for the arena malloc init, followed by each of the malloc/free calls that CPP86 performs. It also shows that the maximum use of the arena heap is 3348 bytes out of 16000. (As a result, I have lowered the arena size significantly from 64K.) CPP86 does not use any fmemalloc calls from the arena malloc wrapper.
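As a rough illustration of how a per-call summary and a maximum-use figure like the one above can be produced, here is a minimal host-side sketch. It is not the code in this PR: `dbg_level`, `dbg_malloc`, `dbg_free` and the output format are hypothetical, `dbg_level` merely stands in for the `malloc.debug` setting, and treating level 2 as the trace level is an assumption taken from the command shown above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int dbg_level = 0;          /* hypothetical stand-in for malloc.debug */
size_t dbg_in_use = 0;      /* bytes currently allocated */
size_t dbg_max_use = 0;     /* high-water mark of dbg_in_use */

/* Header kept in front of each block so dbg_free can recover its size.
 * The union keeps the user pointer suitably aligned. */
typedef union {
    size_t size;
    max_align_t align;
} dbg_hdr;

void *dbg_malloc(size_t n)
{
    dbg_hdr *h;

    if (n > SIZE_MAX - sizeof(dbg_hdr))
        return NULL;                    /* size would overflow */
    h = malloc(sizeof(dbg_hdr) + n);
    if (h == NULL)
        return NULL;
    h->size = n;
    dbg_in_use += n;
    if (dbg_in_use > dbg_max_use)
        dbg_max_use = dbg_in_use;
    if (dbg_level >= 2)                 /* one line per call (format invented) */
        fprintf(stderr, "malloc %lu -> %p (in use %lu, max %lu)\n",
                (unsigned long)n, (void *)(h + 1),
                (unsigned long)dbg_in_use, (unsigned long)dbg_max_use);
    return h + 1;
}

void dbg_free(void *p)
{
    dbg_hdr *h;

    if (p == NULL)
        return;
    h = (dbg_hdr *)p - 1;
    dbg_in_use -= h->size;
    if (dbg_level >= 2)
        fprintf(stderr, "free %p (%lu bytes, in use %lu)\n",
                p, (unsigned long)h->size, (unsigned long)dbg_in_use);
    free(h);
}
```

Reading `dbg_max_use` after a run gives the peak heap usage, which is the number needed to decide how far an arena can be shrunk.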
Here is the result from running C86:
Note only 3290 bytes of 16000 from arena heap, and two separate fmemallocs of 3076 bytes.
NASM86, on the other hand, ends up making thousands of malloc/free requests, all small, and can run in as little as 8K arena heap. However, it also makes tons of fmemalloc calls of large sizes. Nonetheless, I have used this information to reduce the memory requirements of NASM86 to an 8K arena with a 64 byte arena/main memory threshold.
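The arena/main memory threshold can be pictured with a small sketch. This is an interpretation, not the wrapper's actual code: it assumes that requests at or below the threshold are served from the arena heap and larger ones go to main memory via fmemalloc, and that a small request falls back to main memory when the arena is full. The names `choose_source`, `DEMO_THRESHOLD`, `FROM_ARENA` and `FROM_MAIN` are hypothetical.

```c
#include <stddef.h>

#define DEMO_THRESHOLD 64u      /* arena/main memory threshold, in bytes */

enum alloc_source { FROM_ARENA, FROM_MAIN };

/*
 * Decide where a request of n bytes would be served from, given the
 * number of bytes still free in the arena.
 * Assumption: n <= threshold goes to the arena if it fits, anything
 * else goes to main memory.
 */
enum alloc_source choose_source(size_t n, size_t arena_free)
{
    if (n <= DEMO_THRESHOLD && n <= arena_free)
        return FROM_ARENA;
    return FROM_MAIN;
}
```

Under these assumptions, a program that makes thousands of small requests plus a number of large ones keeps the small ones inside a compact arena, so the arena can stay small while the large blocks are taken from main memory.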
I have used the results of this testing to lower the memory requirements of the 8086 toolchain. I will be submitting a PR there shortly.
Tested on QEMU making test.c. Not tested with chess.c - it will be interesting to see how much more memory is used.